Statistical Techniques for Automatically Inferring the Semantics of Verb-Particle Constructions
Abstract
This paper investigates potential features for a statistical approach to inferring the semantics of verb-particle constructions from corpus data. Verb-particles pose particular problems for the computational semantic analysis of language because their meaning often cannot be derived through the usual compositional methods of analysis. Two novel techniques are presented which promise to provide information about the nature and extent of composition. The first measures the extent to which the verb or particle of a given verb-particle can be replaced with a verb or particle of a similar semantic class to form other verb-particles that are attested in the data. The intuition is that if a verb-particle takes part in systematic patterns of this kind, then the verb or particle concerned is more likely to have its simplex meaning. The second technique measures the degree of semantic relatedness between the verb-particle and its component verb. The intuition is that if a verb-particle is semantically similar to its verb, then the verb is more likely to contribute its simplex meaning. These two features are then combined and, together with appropriately annotated data, used to train a classifier.
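The two features described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how semantic classes or semantic relatedness are computed, so the hand-written class neighbours, the co-occurrence counts, the use of cosine similarity, and the names `substitution_score`, `cosine`, and `features` are all assumptions made for the example. Only verb substitution is shown; particle substitution would follow the same pattern.

```python
from math import sqrt

# All data below is invented for illustration; none of it comes from the paper.
# NEIGHBOURS: verbs assumed to share a semantic class with the key verb.
NEIGHBOURS = {"put": {"set", "lay", "place"}, "make": {"create", "build"}}
# ATTESTED: (verb, particle) pairs assumed to occur in the corpus.
ATTESTED = {("put", "down"), ("set", "down"), ("lay", "down"),
            ("place", "down"), ("make", "off")}
# CONTEXTS: toy co-occurrence counts for verbs and verb-particles.
CONTEXTS = {
    "put": {"book": 4, "table": 3},
    "put down": {"book": 4, "table": 3},
    "make": {"cake": 2},
    "make off": {"thief": 3},
}


def substitution_score(verb, particle, neighbours, attested):
    """Fraction of same-class verbs that also form an attested verb-particle."""
    candidates = neighbours.get(verb, set())
    if not candidates:
        return 0.0
    hits = sum((v, particle) in attested for v in candidates)
    return hits / len(candidates)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(count * v.get(word, 0) for word, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def features(verb, particle):
    """Return (substitution score, verb-particle/verb relatedness) for one item."""
    vpc = f"{verb} {particle}"
    return (substitution_score(verb, particle, NEIGHBOURS, ATTESTED),
            cosine(CONTEXTS[vpc], CONTEXTS[verb]))


if __name__ == "__main__":
    for verb, particle in [("put", "down"), ("make", "off")]:
        print(verb, particle, features(verb, particle))
```

In this toy setup a verb-particle such as "put down" scores high on both features, while "make off" scores low on both. Feature pairs of this kind, paired with annotated labels, could then be passed to any standard classifier.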
Similar resources
A Statistical Approach To The Semantics Of Verb-Particles
This paper describes a distributional approach to the semantics of verb-particle constructions (e.g. put up, make off). We report first on a framework for implementing and evaluating such models. We then report on the implementation of some techniques for using statistical models acquired from corpus data to infer the meaning of verb-particle constructions.
Picking them up and Figuring them out: Verb-Particle Constructions, Noise and Idiomaticity
This paper investigates, in a first stage, some methods for the automatic acquisition of verb-particle constructions (VPCs) taking into account their statistical properties and some regular patterns found in productive combinations of verbs and particles. Given the limited coverage provided by lexical resources, such as dictionaries, and the constantly growing number of VPCs, possible ways of a...
Automatic Identification Of English Verb Particle Constructions Using Linguistic Features
This paper presents a method for identifying token instances of verb particle constructions (VPCs) automatically, based on the output of the RASP parser. The proposed method pools together instances of VPCs and verb-PPs from the parser output and uses the sentential context of each such instance to differentiate VPCs from verb-PPs. We show our technique to perform at an F-score of 97.4% at iden...
Classifying Particle Semantics In English Verb-Particle Constructions
Previous computational work on learning the semantic properties of verb-particle constructions (VPCs) has focused on their compositionality, and has left unaddressed the issue of which meaning of the component words is being used in a given VPC. We develop a feature space for use in classification of the sense contributed by the particle in a VPC, and test this on VPCs using the particle up. Th...
Verb-Particle Constructions in the World Wide Web
In this paper we investigate the phenomenon of verb-particle constructions, discussing their characteristics and their availability for use with NLP systems. Combinations automatically extracted from corpora greatly improve the coverage of available resources. However, the data sparseness problem is particularly acute for these constructions and even using a corpus as large as the British Natio...